<memory management> /kash/ A small fast memory holding recently accessed data, designed to speed up subsequent access to the same data. Most often applied to processor-memory access but also used for a local copy of data accessible over a network, etc.
When data is read from, or written to, main memory, a copy is also saved in the cache, along with the associated main memory address. The cache monitors addresses of subsequent reads to see if the required data is already in the cache. If it is (a cache hit) then it is returned immediately and the main memory read is aborted (or not started). If the data is not cached (a cache miss) then it is fetched from main memory and also saved in the cache.
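In outline (a minimal Python sketch, using a dictionary keyed by address; the names MAIN_MEMORY, cache and read_memory are illustrative, not part of any standard):

    MAIN_MEMORY = {0x1000: "alpha", 0x1004: "beta"}   # stand-in for slow main memory
    cache = {}                                        # address -> cached copy

    def read_memory(address):
        if address in cache:            # cache hit: return at once; the main
            return cache[address]       # memory read is never started
        data = MAIN_MEMORY[address]     # cache miss: fetch from main memory...
        cache[address] = data           # ...and save a copy with its address
        return data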
The cache is built from faster memory chips than main memory so a cache hit takes much less time to complete than a normal memory access. The cache may be located on the same integrated circuit as the CPU, in order to further reduce the access time. In this case it is often known as primary cache since there may be a larger, slower secondary cache outside the CPU chip.
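The resulting hierarchy can be pictured as a chain of lookups, fastest first (a sketch continuing the one above; the two-level structure here is assumed for illustration):

    primary = {}     # small, on the CPU chip, fastest
    secondary = {}   # larger, slower, off-chip

    def read_hierarchy(address):
        if address in primary:
            return primary[address]
        if address in secondary:
            data = secondary[address]
        else:
            data = MAIN_MEMORY[address]   # slowest path of all
            secondary[address] = data
        primary[address] = data           # promote toward the processor
        return data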
The most important characteristic of a cache is its hit rate - the fraction of all memory accesses which are satisfied from the cache. This in turn depends on the cache design but mostly on its size relative to the main memory. The size is limited by the cost of fast memory chips. The hit rate also depends on the access pattern of the particular program being run (the sequence of addresses being read and written).
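The figure itself is simply hits / (hits + misses); a counting version of the earlier sketch (the counters are illustrative):

    hits = misses = 0

    def read_counted(address):
        global hits, misses
        if address in cache:
            hits += 1                  # satisfied from the cache
            return cache[address]
        misses += 1                    # had to go to main memory
        data = MAIN_MEMORY[address]
        cache[address] = data
        return data

    def hit_rate():
        total = hits + misses
        return hits / total if total else 0.0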
Caches rely on two properties of the access patterns of most programs: temporal locality (if something is accessed once, it is likely to be accessed again soon) and spatial locality (if one memory location is accessed then nearby memory locations are also likely to be accessed). In order to exploit spatial locality, caches often operate on several words at a time, a "cache line" or "cache block". Main memory reads and writes are whole cache lines.
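To exploit this, an address is split into a line address and an offset within the line; a sketch, assuming 16-byte lines (the figure is an assumption):

    LINE_SIZE = 16   # bytes per cache line

    def split_address(address):
        line = address // LINE_SIZE    # which whole line to transfer
        offset = address % LINE_SIZE   # position of the word within that line
        return line, offset

A miss on any byte of a line brings the whole line into the cache, so a later access to a nearby address in the same line becomes a hit.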
When the processor wants to write to main memory, the data is first written to the cache on the assumption that the processor will probably read it again soon. Various policies are used. In a write-through cache, data is written to main memory at the same time as it is cached. In a write-back cache it is only written to main memory when it is forced out of the cache.
If all accesses were writes then, with a write-through policy, every write to the cache would necessitate a main memory write, thus slowing the system down to main memory speed. However, statistically, most accesses are reads and most of these will be satisfied from the cache. Write-through is simpler than write-back because an entry that is to be replaced can just be overwritten in the cache, as it will already have been copied to main memory. Write-back, by contrast, requires the cache to initiate a main memory write of the flushed entry, followed (for a processor read) by a main memory read. However, write-back is more efficient because an entry may be written many times in the cache without a main memory access.
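The two policies differ only in when main memory is updated; a sketch continuing the one above (the dirty set is illustrative):

    dirty = set()   # lines written in the cache but not yet in main memory

    def write_through(address, data):
        cache[address] = data
        MAIN_MEMORY[address] = data    # main memory updated on every write

    def write_back(address, data):
        cache[address] = data          # repeated writes cost no memory access
        dirty.add(address)

    def flush(address):
        if address in dirty:           # only write-back entries need copying out
            MAIN_MEMORY[address] = cache[address]
            dirty.discard(address)
        del cache[address]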
When the cache is full and another line of data is to be cached, a cache entry is selected to be written back to main memory, or "flushed". The new line is then put in its place. Which entry is chosen to be flushed is determined by a "replacement algorithm".
Some processors have separate instruction and data caches. Both can be active at the same time, allowing an instruction fetch to overlap with a data read or write. This separation also avoids the possibility of cache conflicts between, say, the instructions in a loop and the data in an array accessed by that loop.
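Modelled as two independent stores (a sketch continuing the first one; the simultaneity is a property of the hardware and is not captured here):

    icache = {}   # instruction fetches go here
    dcache = {}   # data reads and writes go here

    def fetch_instruction(address):
        if address not in icache:
            icache[address] = MAIN_MEMORY[address]
        return icache[address]

    def load_data(address):
        if address not in dcache:                  # a loop's code and its array
            dcache[address] = MAIN_MEMORY[address] # can no longer evict each other
        return dcache[address]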
See also direct mapped cache, fully associative cache, sector mapping, set associative cache.
(1997-06-25)